AITopics | localization network

Object discovery is a core task in computer vision. While rapid progress has been made in supervised object detection, its unsupervised counterpart remains largely unexplored. With the growth of data volume, the high cost of annotation is the major limitation hindering further study. Therefore, discovering objects without annotations is of great significance. However, this task seems impractical on still images or point clouds alone due to the lack of discriminative information.

electronic proceedings, name change, unsupervised object discovery, (5 more...)

Neural Information Processing Systems

Technology: Information Technology > Artificial Intelligence > Vision (0.63)

e7407ab5e89c405d28ff6807ffec594a-Supplemental-Conference.pdf

Neural Information Processing Systems, Aug-19-2025, 15:03:29 GMT

artificial intelligence, machine learning, point cloud, (16 more...)

Neural Information Processing Systems

Technology:

Information Technology > Artificial Intelligence > Machine Learning (0.73)
Information Technology > Artificial Intelligence > Vision (0.51)

e7407ab5e89c405d28ff6807ffec594a-Paper-Conference.pdf

Neural Information Processing Systems, Aug-19-2025, 15:03:25 GMT

annotation, artificial intelligence, machine learning, (17 more...)

Neural Information Processing Systems

Country: Asia > China (0.04)

Industry: Information Technology (0.46)

Technology:

Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Sensing and Signal Processing > Image Processing (0.68)
Information Technology > Artificial Intelligence > Representation & Reasoning (0.68)
(2 more...)

Efficient End-to-end Visual Localization for Autonomous Driving with Decoupled BEV Neural Matching

Miao, Jinyu, Wen, Tuopu, Luo, Ziang, Qian, Kangan, Fu, Zheng, Wang, Yunlong, Jiang, Kun, Yang, Mengmeng, Huang, Jin, Zhong, Zhihua, Yang, Diange

arXiv.org Artificial Intelligence, Mar-2-2025

Accurate localization plays an important role in high-level autonomous driving systems. Conventional map-matching-based localization methods solve for poses by explicitly matching map elements with sensor observations; they are generally sensitive to perception noise and therefore require costly hyper-parameter tuning. In this paper, we propose an end-to-end localization neural network which directly estimates vehicle poses from surrounding images, without explicitly matching perception results with HD maps. To ensure efficiency and interpretability, a decoupled BEV neural matching-based pose solver is proposed, which estimates poses in a differentiable sampling-based matching module. Moreover, the sampling space is greatly reduced by decoupling the feature representation affected by each DoF of poses. The experimental results demonstrate that the proposed network is capable of performing decimeter-level localization with mean absolute errors of 0.19m, 0.13m and 0.39 [...] Visual localization serves as a vital component in high-level Autonomous Driving (AD) systems due to its ability to estimate vehicle poses with an economical sensor suite. In recent decades, several works have achieved extraordinary success in terms of localization accuracy and robustness [1]. A plethora of scene maps has been developed in the domain of visual localization research, yielding varying degrees of pose estimation accuracy [1]. In conventional robotic systems, visual localization systems often employ geo-tagged frames [2], [3] and visual landmark maps [4].
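The two ideas in this abstract, a differentiable sampling-based matcher and a per-DoF decoupling that shrinks the sampling space, can be illustrated with a minimal sketch. This is not the paper's implementation: the soft-argmax readout, the function names, and the candidate counts are all assumptions chosen for illustration, and the matching scores are taken as given rather than computed from BEV features.

```python
import math
from typing import Sequence, Tuple


def soft_argmax(candidates: Sequence[float], scores: Sequence[float],
                temperature: float = 1.0) -> float:
    """Differentiable readout: softmax-weighted mean of the candidate values.

    Assumption: a soft-argmax is one common way to make a sampling-based
    matcher differentiable; the paper may use a different readout.
    """
    peak = max(scores)  # subtract the max for numerical stability
    weights = [math.exp((s - peak) / temperature) for s in scores]
    total = sum(weights)
    return sum(c * w for c, w in zip(candidates, weights)) / total


def joint_sample_count(n_lon: int, n_lat: int, n_yaw: int) -> int:
    """Pose hypotheses needed when all three DoF are searched together."""
    return n_lon * n_lat * n_yaw


def decoupled_sample_count(n_lon: int, n_lat: int, n_yaw: int) -> int:
    """Pose hypotheses needed when each DoF is searched on its own."""
    return n_lon + n_lat + n_yaw


def solve_decoupled(lon: Tuple[Sequence[float], Sequence[float]],
                    lat: Tuple[Sequence[float], Sequence[float]],
                    yaw: Tuple[Sequence[float], Sequence[float]]) -> Tuple[float, float, float]:
    """Estimate each DoF independently from its own (candidates, scores) pair."""
    return soft_argmax(*lon), soft_argmax(*lat), soft_argmax(*yaw)


# Hypothetical example: 21 candidates per DoF.
joint = joint_sample_count(21, 21, 21)
decoupled = decoupled_sample_count(21, 21, 21)
centered = soft_argmax([-1.0, 0.0, 1.0], [0.0, 5.0, 0.0])
```

With 21 candidates per degree of freedom, the joint search evaluates 21 × 21 × 21 hypotheses while the decoupled search evaluates 21 + 21 + 21, which is the kind of reduction the abstract refers to.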

bev feature, localization, proceedings, (14 more...)

arXiv.org Artificial Intelligence

2503.00862

Country:

Asia > China > Beijing > Beijing (0.05)
Europe > Germany > Baden-Württemberg > Karlsruhe Region > Karlsruhe (0.04)

Genre: Research Report > New Finding (0.34)

Industry:

Transportation > Ground > Road (1.00)
Information Technology > Robotics & Automation (0.91)
Automobiles & Trucks (0.91)

Technology:

Information Technology > Artificial Intelligence > Robots > Autonomous Vehicles (1.00)
Information Technology > Artificial Intelligence > Machine Learning (1.00)

Deep Joint Task Learning for Generic Object Extraction

Xiaolong Wang, Liliang Zhang, Liang Lin, Zhujin Liang, Wangmeng Zuo

Neural Information Processing Systems, Feb-12-2025, 01:00:55 GMT

This paper investigates how to extract objects of interest without relying on handcrafted features or sliding-window approaches, aiming to jointly solve two subtasks: (i) rapidly localizing salient objects in images, and (ii) accurately segmenting the objects based on the localizations. We present a general joint task learning framework, in which each task (either object localization or object segmentation) is tackled via a multi-layer convolutional neural network, and the two networks work collaboratively to boost performance. In particular, we propose to incorporate latent variables bridging the two networks in a joint optimization manner. The first network directly predicts the positions and scales of salient objects from raw images, and the latent variables adjust the object localizations to feed the second network, which produces pixelwise object masks. An EM-type method is presented for the optimization, iterating between two steps: (i) using the two networks, it estimates the latent variables with an MCMC-based sampling method; (ii) it optimizes the parameters of the two networks jointly via backpropagation, with the latent variables fixed. Extensive experiments suggest that our framework significantly outperforms other state-of-the-art approaches in both accuracy and efficiency (e.g.
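Step (i) of the EM-type procedure relies on MCMC sampling of the latent variables. The sketch below shows a generic random-walk Metropolis-Hastings sampler on a one-dimensional toy problem; it is only meant to illustrate the sampling idea. The energy function, the scalar "localization offset", and all names are hypothetical, and the paper's actual sampler and latent-variable parameterization are not given in this abstract.

```python
import math
import random
from typing import Callable, List


def metropolis_hastings(energy: Callable[[float], float], start: float,
                        n_samples: int, step: float = 1.0,
                        seed: int = 0) -> List[float]:
    """Random-walk Metropolis-Hastings targeting a density ~ exp(-energy(x))."""
    rng = random.Random(seed)
    current = start
    current_energy = energy(current)
    samples: List[float] = []
    for _ in range(n_samples):
        # Symmetric Gaussian proposal around the current state.
        proposal = current + rng.gauss(0.0, step)
        proposal_energy = energy(proposal)
        # Accept with probability min(1, exp(E_current - E_proposal)).
        if (proposal_energy <= current_energy
                or rng.random() < math.exp(current_energy - proposal_energy)):
            current, current_energy = proposal, proposal_energy
        samples.append(current)
    return samples


def toy_energy(offset: float) -> float:
    """Hypothetical energy: lowest when the localization offset equals 2.0."""
    return 0.5 * (offset - 2.0) ** 2


samples = metropolis_hastings(toy_energy, start=0.0, n_samples=5000)
kept = samples[500:]  # discard burn-in samples
estimate = sum(kept) / len(kept)
```

In an EM-style loop, an estimate like this would be held fixed while the network parameters are updated by backpropagation, and the two steps would then alternate.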

artificial intelligence, deep learning, machine learning, (17 more...)

Neural Information Processing Systems

Country: Asia > China (0.29)

Genre:

Research Report > Promising Solution (0.34)
Research Report > Experimental Study (0.34)

Technology:

Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

4D Unsupervised Object Discovery

Neural Information Processing Systems, Jan-19-2025, 04:29:41 GMT

Object discovery is a core task in computer vision. While rapid progress has been made in supervised object detection, its unsupervised counterpart remains largely unexplored. With the growth of data volume, the high cost of annotation is the major limitation hindering further study. Therefore, discovering objects without annotations is of great significance. However, this task seems impractical on still images or point clouds alone due to the lack of discriminative information. In this paper, we propose 4D unsupervised object discovery, jointly discovering objects from 4D data: 3D point clouds and 2D RGB images with temporal information.

localization network, point cloud, unsupervised object discovery, (3 more...)

Neural Information Processing Systems

Technology: Information Technology > Artificial Intelligence > Vision (0.68)

Spatial Transformer Network YOLO Model for Agricultural Object Detection

Zambre, Yash, Rajkitkul, Ekdev, Mohan, Akshatha, Peeples, Joshua

arXiv.org Artificial Intelligence, Sep-15-2024

Object detection plays a crucial role in the field of computer vision by autonomously locating and identifying objects of interest. The You Only Look Once (YOLO) model is an effective single-shot detector. However, YOLO faces challenges in cluttered or partially occluded scenes and can struggle with small, low-contrast objects. We propose a new method that integrates spatial transformer networks (STNs) into YOLO to improve performance. The proposed STN-YOLO aims to enhance the model's effectiveness by focusing on important areas of the image and improving the spatial invariance of the model before the detection process. Our proposed method improved object detection performance both qualitatively and quantitatively. We explore the impact of different localization networks within the STN module as well as the robustness of the model across different spatial transformations. We apply STN-YOLO to benchmark datasets for agricultural object detection as well as a new dataset from a state-of-the-art plant phenotyping greenhouse facility. Our code and dataset are publicly available.
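For readers unfamiliar with the STN module mentioned here: a spatial transformer generally consists of a localization network that regresses transformation parameters, a grid generator, and a sampler. The sketch below covers only the last two stages in plain Python for a single-channel image, with the 2x3 affine matrix `theta` supplied by hand rather than predicted by a network. It is a simplified illustration under those assumptions, not the STN-YOLO code, and the function names are hypothetical.

```python
import math
from typing import List, Sequence, Tuple

Image = List[List[float]]
Grid = List[List[Tuple[float, float]]]


def affine_grid(theta: Sequence[Sequence[float]], height: int, width: int) -> Grid:
    """For each output pixel, compute the source location in normalized [-1, 1] coordinates."""
    grid: Grid = []
    for i in range(height):
        y = -1.0 + 2.0 * i / (height - 1) if height > 1 else 0.0
        row = []
        for j in range(width):
            x = -1.0 + 2.0 * j / (width - 1) if width > 1 else 0.0
            src_x = theta[0][0] * x + theta[0][1] * y + theta[0][2]
            src_y = theta[1][0] * x + theta[1][1] * y + theta[1][2]
            row.append((src_x, src_y))
        grid.append(row)
    return grid


def bilinear_sample(image: Image, grid: Grid) -> Image:
    """Sample the image at the grid locations; out-of-range neighbours contribute zero."""
    h, w = len(image), len(image[0])
    out: Image = []
    for row in grid:
        out_row = []
        for src_x, src_y in row:
            # Convert normalized coordinates back to pixel coordinates.
            px = (src_x + 1.0) * (w - 1) / 2.0
            py = (src_y + 1.0) * (h - 1) / 2.0
            x0, y0 = math.floor(px), math.floor(py)
            fx, fy = px - x0, py - y0
            value = 0.0
            for yy, wy in ((y0, 1.0 - fy), (y0 + 1, fy)):
                for xx, wx in ((x0, 1.0 - fx), (x0 + 1, fx)):
                    if 0 <= yy < h and 0 <= xx < w:
                        value += wy * wx * image[yy][xx]
            out_row.append(value)
        out.append(out_row)
    return out


image = [[1.0, 2.0, 3.0],
         [4.0, 5.0, 6.0],
         [7.0, 8.0, 9.0]]
identity = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0]]
flip_x = [[-1.0, 0.0, 0.0], [0.0, 1.0, 0.0]]  # mirror left-right

same = bilinear_sample(image, affine_grid(identity, 3, 3))
mirrored = bilinear_sample(image, affine_grid(flip_x, 3, 3))
```

In a full STN, a small localization network would predict `theta` from the input feature map, and because bilinear sampling is differentiable with respect to the sampling locations, gradients can flow back into that network.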

dataset, detection, localization network, (11 more...)

arXiv.org Artificial Intelligence

2407.21652

Country: North America > United States > Texas > Brazos County > College Station (0.14)

Genre: Research Report (0.50)

Technology:

Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)

Deep Joint Task Learning for Generic Object Extraction

Neural Information Processing Systems, Mar-13-2024, 13:06:35 GMT

This paper investigates how to extract objects of interest without relying on handcrafted features or sliding-window approaches, aiming to jointly solve two subtasks: (i) rapidly localizing salient objects in images, and (ii) accurately segmenting the objects based on the localizations. We present a general joint task learning framework, in which each task (either object localization or object segmentation) is tackled via a multi-layer convolutional neural network, and the two networks work collaboratively to boost performance. In particular, we propose to incorporate latent variables bridging the two networks in a joint optimization manner. The first network directly predicts the positions and scales of salient objects from raw images, and the latent variables adjust the object localizations to feed the second network, which produces pixelwise object masks. An EM-type method is presented for the optimization, iterating between two steps: (i) using the two networks, it estimates the latent variables with an MCMC-based sampling method; (ii) it optimizes the parameters of the two networks jointly via backpropagation, with the latent variables fixed. Extensive experiments suggest that our framework significantly outperforms other state-of-the-art approaches in both accuracy and efficiency (e.g.

dataset, latent variable, segmentation network, (14 more...)

Neural Information Processing Systems

Country: